14 research outputs found

    Feature Selection via Binary Simultaneous Perturbation Stochastic Approximation

    Feature selection (FS) has become an indispensable task in dealing with today's highly complex pattern recognition problems with massive numbers of features. In this study, we propose a new wrapper approach for FS based on binary simultaneous perturbation stochastic approximation (BSPSA). This pseudo-gradient descent stochastic algorithm starts with an initial feature vector and moves toward the optimal feature vector via successive iterations. In each iteration, the current feature vector's individual components are perturbed simultaneously by random offsets from a qualified probability distribution. We present computational experiments on datasets with numbers of features ranging from a few dozen to thousands using three widely used classifiers as wrappers: nearest neighbor, decision tree, and linear support vector machine. We compare our methodology against the full set of features as well as a binary genetic algorithm and sequential FS methods using cross-validated classification error rate and AUC as the performance criteria. Our results indicate that features selected by BSPSA compare favorably to alternative methods in general and BSPSA can yield superior feature sets for datasets with tens of thousands of features by examining an extremely small fraction of the solution space. We are not aware of any other wrapper FS methods that are computationally feasible with good convergence properties for such large datasets. Comment: This is the Istanbul Sehir University Technical Report #SHR-ISE-2016.01. A short version of this report has been accepted for publication at Pattern Recognition Letters.
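The iteration described in this abstract (perturb every component at once, compare the loss at the two perturbed points, step against the estimated gradient) can be illustrated with a generic SPSA-style sketch. This is not the authors' implementation: the function names `bspsa` and `toy_loss`, the gain constants `a` and `c`, the 0.5 rounding threshold, and the toy loss standing in for a cross-validated classifier error are all assumptions made for illustration.

```python
import random


def toy_loss(mask):
    """Hypothetical stand-in for a cross-validated error rate (lower is better)."""
    relevant = {0, 3, 5}  # assumed "truly useful" feature indices
    missed = sum(1 for i in relevant if mask[i] == 0)
    extra = sum(1 for i, bit in enumerate(mask) if bit == 1 and i not in relevant)
    return missed + 0.1 * extra


def bspsa(loss, n_features, iterations=100, a=0.05, c=0.1, seed=0):
    """SPSA-style search over binary feature masks (illustrative constants)."""
    rng = random.Random(seed)

    def clip(x):
        return min(1.0, max(0.0, x))

    def to_mask(v):
        # Round the continuous vector to a 0/1 feature mask.
        return [1 if x >= 0.5 else 0 for x in v]

    w = [0.5] * n_features  # continuous feature vector kept in [0, 1]
    best_mask = to_mask(w)
    best_loss = loss(best_mask)
    for _ in range(iterations):
        # Perturb all components simultaneously with random +/-1 offsets.
        delta = [rng.choice((-1, 1)) for _ in range(n_features)]
        plus = to_mask([clip(x + c * d) for x, d in zip(w, delta)])
        minus = to_mask([clip(x - c * d) for x, d in zip(w, delta)])
        y_plus, y_minus = loss(plus), loss(minus)
        # Remember the best mask evaluated so far.
        for mask, y in ((plus, y_plus), (minus, y_minus)):
            if y < best_loss:
                best_mask, best_loss = mask, y
        # Two loss evaluations give a gradient estimate for every component.
        w = [clip(x - a * (y_plus - y_minus) / (2 * c * d)) for x, d in zip(w, delta)]
    return best_mask, best_loss
```

Each iteration costs two loss evaluations regardless of the number of features, which is the property that makes this family of methods attractive for very wide datasets. In a real wrapper, `toy_loss` would be replaced by the cross-validated error of a classifier trained on the selected columns.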

    Data mining applications in social lending and anchorage planning

    The printed copy of the thesis is held at the İstanbul Şehir University Library.

    Global Air Quality and COVID-19 Pandemic: Do We Breathe Cleaner Air?

    The global spread of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has challenged most countries worldwide. It was quickly recognized that reduced activities (lockdowns) during the Coronavirus Disease of 2019 (COVID-19) pandemic produced major changes in air quality. Our objective was to assess the impacts of COVID-19 lockdowns on ground-level PM2.5, NO2, and O3 concentrations on a global scale. We obtained data from 34 countries, 141 cities, and 458 air monitoring stations on 5 continents (few data from Africa). On a global average basis, a 34.0% reduction in NO2 concentration and a 15.0% reduction in PM2.5 were estimated during the strict lockdown period (until April 30, 2020). Global average O3 concentration increased by 86.0% during this same period. Individual country and continent-wise comparisons have been made between lockdown and business-as-usual periods. Universally, NO2 was the pollutant most affected by the COVID-19 pandemic. These effects were likely because its emissions were from sources that were typically restricted (i.e., surface traffic and non-essential industries) by the lockdowns and because of its short lifetime in the atmosphere. Our results indicate that lockdown measures and resulting reduced emissions reduced exposure to most harmful pollutants and could provide global-scale health benefits. However, the increased O3 may have substantially reduced those benefits and more detailed health assessments are required to accurately quantify the health gains. At the same time, these restrictions were obtained at substantial economic costs and with other health issues (depression, suicide, spousal abuse, drug overdoses, etc.). Thus, any similar reductions in air pollution would need to be obtained without these extensive economic and other consequences produced by the imposed activity reductions. Peer reviewed.

    Risk assessment in social lending via random forests

    With the advance of electronic commerce and social platforms, social lending (also known as peer-to-peer lending) has emerged as a viable platform where lenders and borrowers can do business without the help of institutional intermediaries such as banks. Social lending has gained significant momentum recently, with some platforms reaching multi-billion dollar loan circulation in a short amount of time. On the other hand, sustainability and possible widespread adoption of such platforms depend heavily on reliable risk attribution to individual borrowers. For this purpose, we propose a random forest (RF) based classification method for predicting borrower status. Our results on data from the popular social lending platform Lending Club (LC) indicate that the RF-based method outperforms the FICO credit scores as well as LC grades in the identification of good borrowers.

    Reliability and stability of a statistical model to predict ground-based PM2.5 over 10 years in Karachi, Pakistan, using satellite observations

    Understanding the complex mechanisms of climate change and its environmental consequences requires the collection and subsequent analysis of geospatial data from observations and numerical modeling. Multivariable linear regression and mixed-effects models were used to estimate daily surface fine particulate matter (PM2.5) levels in the megacity of Pakistan. The main parameters for the multivariable linear regression model were the 10-km-resolution satellite aerosol optical depth (AOD) and daily averaged meteorological parameters from ground monitoring (temperature, dew point, relative humidity, wind speed, wind direction, and planetary boundary layer height). Ground-based PM2.5 was measured in two stations in the city, Korangi (industrial/residential) and Tibet Center (commercial/residential). The initial linear regression model was modified using a stepwise selection procedure and adding interaction parameters. Finally, the modified model showed a strong correlation between the PM2.5–satellite AOD and other meteorological parameters (R² = 0.88–0.92 and p-value = 10⁻⁷ depending on the season and station). The mixed-effect technique improved the model performance by increasing the R² values to 0.99 and 0.93 for the Korangi and Tibet Center sites, respectively. Cross-validation methods were used to confirm the reliability of the model to predict PM2.5 after 10 years.

    The impact of frying aerosol on human brain activity

    Knowledge of the impact of exposure to indoor ultrafine particles (UFPs) on the human brain is limited. Twelve non-atopic, non-smoking, and healthy adults (7 female and 5 male, on average 22 years old) were monitored for brain physiological responses via electroencephalographs (EEGs) during cooking. Ground beef was fried in sunflower oil on an electric stove without ventilation. UFPs, particulate matter (PM) (PM1, PM2.5, PM4, PM10), CO2, indoor temperature, RH, and oil and meat temperatures were monitored continuously throughout the experiments. The UFP peak concentration was recorded to be approximately 2.0 × 10⁵ particles/cm³. EEGs were recorded before exposure, at the end of cooking when PM peak concentrations were observed, and 30 min after the end of the cooking session (post-exposure). Brain electrical activity changed statistically significantly during post-exposure compared with before exposure, suggesting the translocation of UFPs to the brain, occurring solely in the frontal and temporal lobes of the brain. Study participants older than 25 were more susceptible to UFPs compared to those younger than 25. Also, the brain abnormality was mainly driven by male rather than female study participants. The brain slow-wave band (delta) decreased while the fast-wave band (Beta3) increased, similar to the pattern found in the literature for exposure to smoking fumes and diesel exhaust.

    Human exposure to aerosol from indoor gas stove cooking and the resulting nervous system responses

    Our knowledge of the effects of exposure to indoor ultrafine particles (sub-100 nm, #/cm³) on human brain activity is very limited. The effects of cooking ultrafine particles (UFP) on healthy adults were assessed using electroencephalographs (EEGs) for brain response. Peak ultrafine particle concentrations were approximately 3 × 10⁵ particles/cm³, and the average level was 1.64 × 10⁵ particles/cm³. The average particle number emission rate (S) and the average number decay rate (a+k) for chicken frying in brain experiments were calculated to be 2.82 × 10¹² (SD = 1.83 × 10¹², R² = 0.91, p = 0.0013) particles/min and 0.47 (SD = 0.30, R² = 0.90, p < 0.0001) min⁻¹, respectively. EEGs were recorded before and during cooking (14 min) and 30 min after the cooking sessions. The brain fast-wave band (beta) decreased during exposure, similar to people with neurodegenerative diseases. It subsequently increased to its pre-exposure condition for 70% of the study participants after 30 min. The brain slow-wave band to fast-wave band ratio (theta/beta ratio) increased during and after exposure, similar to observed behavior in early-stage Alzheimer's disease (AD) patients. The brain then tended to return to its normal condition within 30 min following the exposure. This study suggests that people chronically exposed to high concentrations of cooking aerosol might progress toward AD.
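The abstract reports a number decay rate (a+k) but does not say how it was estimated. One common approach, assumed here purely for illustration, is to treat the room as a single well-mixed zone in which the concentration falls as C(t) = C0 · exp(−(a+k)·t) once the source stops, and to take the negated least-squares slope of ln C against time. The function name `decay_rate` and the sample values in the usage note are hypothetical.

```python
import math


def decay_rate(times_min, concentrations):
    """Estimate (a+k) in 1/min from post-source concentration readings.

    Assumes exponential decay, so ln(C) is linear in time and the
    decay rate is the negated least-squares slope.
    """
    logs = [math.log(c) for c in concentrations]
    n = len(times_min)
    t_mean = sum(times_min) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_min, logs))
    sxx = sum((t - t_mean) ** 2 for t in times_min)
    return -sxy / sxx
```

For example, readings generated from C(t) = 3e5 · exp(−0.47·t) at t = 0, 5, 10, 15, 20 min would return a rate very close to 0.47 per minute; real measurements are noisy, so the fitted value depends on which part of the decay curve is used.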